This repository was archived by the owner on Oct 13, 2025. It is now read-only.

Conversation

@dermatologist
Owner

No description provided.

@dermatologist dermatologist requested a review from Copilot May 5, 2025 01:10
Contributor

Copilot AI left a comment

Pull Request Overview

This PR merges the 'feature/cluster-2' branch to enhance data ingestion, add clustering and topic modelling functionality, and refactor ML operations to use PyTorch.

  • Updated file reading to support single file, folder, and URL inputs with optional ignore-words filtering.
  • Introduced the new ClusterDocs module for document clustering and topic visualizations.
  • Refactored MLQRMine to replace Keras-based neural nets with a PyTorch implementation.
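
As a rough illustration of the first bullet, input handling of this kind might look like the sketch below. It is not the code from `src/qrmine/readfiles.py`: the helper names `classify_source` and `drop_ignored_words` are hypothetical, and only the parameter name `comma_separated_ignore_words` is taken from the signature quoted in this review.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse


def classify_source(source: str) -> str:
    """Return 'url', 'folder', or 'file' for an input string (hypothetical helper)."""
    # Treat only http(s) schemes as URLs so plain paths are never misread.
    if urlparse(source).scheme in ("http", "https"):
        return "url"
    # An existing directory is handled as a folder of documents.
    if Path(source).is_dir():
        return "folder"
    # Anything else is assumed to be a single file path.
    return "file"


def drop_ignored_words(text: str, comma_separated_ignore_words: Optional[str] = None) -> str:
    """Remove ignored words from text (case-insensitive, whitespace-delimited tokens)."""
    if not comma_separated_ignore_words:
        return text
    ignored = {
        word.strip().lower()
        for word in comma_separated_ignore_words.split(",")
        if word.strip()
    }
    return " ".join(tok for tok in text.split() if tok.lower() not in ignored)
```

Keeping the source-type check separate from the word filter means each piece can be tested without touching the file system or the network.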

Reviewed Changes

Copilot reviewed 26 out of 30 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| src/qrmine/readfiles.py | Updated file input handling with support for txt/pdf files, folders, and URLs. |
| src/qrmine/nlp_qrmine.py | Minor adjustments updating parameter usage. |
| src/qrmine/mlqrmine.py | Transitioned from Keras to a PyTorch-based neural network implementation. |
| src/qrmine/main.py | Extended CLI options to incorporate new clustering and visualization features. |
| src/qrmine/content.py | Updated content initialization to support language and processing enhancements. |
| src/qrmine/cluster.py | Introduced new clustering module providing LDA topic models and document clustering. |
| pyproject.toml & workflows | Revised dependency definitions and CI/CD configuration updates. |
Files not reviewed (4)
  • dev-requirements.in: Language not supported
  • dev-requirements.txt: Language not supported
  • setup.cfg: Language not supported
  • src/qrmine/resources/df_dominant_topic.csv: Language not supported
Comments suppressed due to low confidence (1)

src/qrmine/readfiles.py:42

  • Avoid using the built-in name 'input' as a parameter. Consider renaming it to 'source' or 'input_path' for clarity.
def read_file(self, input, comma_separated_ignore_words=None):
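
A minimal sketch of the suggested rename, assuming `source` is the chosen replacement name. The class `Reader` and the method body are hypothetical stand-ins, since the real body is not shown in this conversation. The point is only that the built-in `input()` stays reachable once no parameter shadows it.

```python
import builtins
from typing import List, Optional, Tuple


class Reader:  # hypothetical stand-in for the class that owns read_file
    def read_file(
        self, source: str, comma_separated_ignore_words: Optional[str] = None
    ) -> Tuple[str, List[str]]:
        # With the parameter renamed, `input` still refers to the built-in here.
        assert input is builtins.input
        # Placeholder body: split the ignore list; the real method reads the source.
        ignore_words = (
            [w.strip() for w in comma_separated_ignore_words.split(",") if w.strip()]
            if comma_separated_ignore_words
            else []
        )
        return source, ignore_words
```

Callers that pass the first argument positionally are unaffected by the rename. Callers that use the keyword `input=` would need updating.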

@dermatologist dermatologist merged commit 715b60e into develop May 7, 2025
3 checks passed
@dermatologist dermatologist deleted the feature/PR-ver-4 branch May 7, 2025 19:26

2 participants